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STATE-DEPENDENT RICCATI EQUATION REGULATION 
OF SYSTEMS WITH STATE AND CONTROL 
NONLINEARITIES 

Scott C. Beeler* 


ABSTRACT 

The state-dependent Riccati equation (SDRE) is the basis of a technique for suboptimal 
feedback control of a nonlinear quadratic regulator (NQR) problem. It is an extension 
of the Riccati equation used for feedback control of linear problems, with the addition 
of nonlinearities in the state dynamics of the system resulting in a state-dependent 
gain matrix as the solution of the equation. In this paper several variations on the 
SDRE-based method will be considered for the feedback control problem with control 
nonlinearities. The control nonlinearities may result in complications in the numerical 
implementation of the control, which the different versions of the SDRE method must 
try to overcome. The control methods will be applied to three test problems and their 
resulting performance analyzed. 

1 INTRODUCTION 

The nonlinear quadratic regulator (NQR) problem can be approached in many different 
ways, using different approximation techniques or numerical schemes to obtain a feedback 
control as near to optimal as possible. A direct formula for an optimal feedback control is 
almost always impossible for nonlinear systems, so several suboptimal methods have been 
proposed in the literature. Among these is the method involving the state-dependent Riccati 
equation (SDRE), based on the constant-valued Riccati equation used to find the optimal 
feedback control for linear systems. The SDRE extends this approach to the nonlinear case 
by allowing the matrices involved to be functions of the state variables, and possibly the 
controls as well. The SDRE method has the advantage of being fairly simple numerically in 
comparison to many other techniques, as well as being closely related to the well-understood 
Riccati equation for linear problems. 

Early work on the state-dependent Riccati equation was done by Pearson [1], Garrard, 
McClamroch and Clark [2], Burghart [3], and Wernli and Cook [4], More recently, the 
SDRE control has been studied by Krikelis and Kiriakidis [5], and Cloutier, D’Souza and 
Mracek [6, 7] (with the SDRE applied to a nonlinear benchmark problem in [8]). Hammett, 
Hall and Ridgely look at controllability issues for the SDRE in [9]. A comparison study 
involving the SDRE among other nonlinear control methods was done in [10], and tracking 
control and state estimation methods using the SDRE were developed in [11], 

Most of these methods assume a system in a control-affine form such as 

x = f(x) + B(x)u, (1) 
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where there are state nonlinearities but only linear dependence on the control inputs. Of 
the above cited papers, only Wernli and Cook [4] consider the more general case 

x = f(x,u). (2) 

This type of system leads to significant complications in the SDRE method control equations, 
making it much more difficult to find a direct formula for u in terms of x. However, since 
there are many applications where the control inputs are nonlinear, we wish to extend the 
state-dependent Riccati equation method to be applicable to the wider case of equation (2), 
and so in this paper we will be studying various techniques to allow control of this more 
complex problem. 

The approach of Wernli and Cook is to split the state and control dynamics matrices into 
constant and variable parts, and then use those to solve for a truncated power series approx- 
imation to the gain matrix solution of the SDRE. This allows the numerical calculations 
to be comparatively simple and mostly done offline prior to the application of the control 
to the system. This approach was also used for problems with nonlinear state dynamics in 
[10], and will be one of the techniques used in this paper. An alternate approach is to do 
most of the calculations online, solving the SDRE repeatedly to update the control as the 
state changes. This has the advantage of solving the exact equation rather than an approxi- 
mation, but it must do so constantly during the control implementation rather than a single 
time prior to the start of the problem. One way of working around control nonlinearities is 
to move them into the state dynamics by reformulating the system with a cheap control (or 
integral control) approach, as suggested in [12]. There are some numerical complications 
to these techniques which may favor one over the others in certain circumstances, as will be 
discussed later in this paper. 

A summary of the derivation and nature of the state-dependent Riccati equation and its 
use for nonlinear feedback control problems will be given in Section 2. The two primary 
numerical realizations of the nonlinear-w SDRE to be considered here will then be described: 
the power series approach in Section 3 and the online control update approach in Section 4. 
Several versions of these control formulations will be applied to three example problems with 
the results discussed in Sections 5-7. An evaluation of the SDRE techniques and overall 
conclusions will be given in Section 8. 


2 BASIS OF THE STATE-DEPENDENT RICCATI EQUATION 

The state-dependent Riccati equation (SDRE), as part of a suboptimal feedback control 
for nonlinear systems, has its roots in the optimality conditions for the NQR problem. 
This parallels the constant- valued algebraic Riccati equation’s status in finding the optimal 
feedback control for a given linear quadratic regulator (LQR) problem. The nonlinear 
system we will study is of the form 

j x(t) = f(x(t)) + g(x(t),u(x(t))) , . 

1 x(0) = x 0 , W 


with x G R m and u : R m — y R k . The objective in this problem is to find the feedback 
control u(x) which minimizes the quadratic cost functional 


J(xo, u) = ^ f (x t Qx + u t Ru) dt 
Z j 0 


( 4 ) 


2 



with given const ant-valued state and control weight matrices Q (symmetric, positive semi- 
definite) and R (symmetric, positive definite). 

The SDRE is derived from the optimality conditions on the Hamiltonian for this problem, 
which is defined as 

U[x, u,p ) = ^ x t Qx + ^u T Ru + p T (f(x) + g(x, «)) . (5) 

From the Hamiltonian the necessary conditions for the optimal control of the dynamic system 
and cost functional in equations (3-4) are given in terms of x, u, and the costate p E R m by 

x = - X - = f(x)+g(x,u ) ( 6 ) 

p = -^ = - Qx -^-{ x ) p -^-( x , u ) p ( 7 ) 

0 = l£ = Ru + % 7 {x ’ v)p - (8) 

(See references [13, 14], for example.) From equation (8), the control can be written as 


da T 

u(t) = -R 1 -^-( x (t),u(t))p(t). (9) 

In a linear problem, where f(x) = A 0 x and g(x,u ) = B 0 u, we would look for a costate 
solution of the form p(t) = U 0 x(t). We would take the time derivative of this costate formula, 
substitute in equations (6,7,9) and the chosen formula for p, and obtain a linear feedback 
control u(x) = — R~ 1 BqU 0 x 1 with the gain n 0 given by the solution to the const ant-valued 
Riccati equation 

n 0 7Lo + Aq no — n 0 B 0 R 1 H^no + Q = o, (10) 

which is well understood and for which good solvers are available. 

For the more general nonlinear case the SDRE method mimics the above approach by 
rewriting the functions in (3) in quasi-linear form as f(x) = A(x)x and g(x,u ) = B(x,u)u. 
(Note that the choices of the matrix functions A and B are not unique, and different sub- 
optimal SDRE controls will result from using different choices of A and B in the manner 
described below.) This “quasi-linearization” rewriting / and g has also been called an 
“apparent linearization” , “extended linearization” , and “state-dependent coefficient param- 
eterization” in various places. With these new forms of / and g 1 we seek a costate of the 
form 

p(t) = U(x(t),u(t))x(t). (11) 

With a substitution of the above p and g formulas into equation (9), the control equation 
becomes 


u 


(x) = — R 1 B T (x,u)U(x,u)x — R ~2 u i( ~ If (x,u)x. 


( 12 ) 


i = 1 


The .B-column derivative with respect to u, 

( dBu/dui 


dB 


1 — >m,i 


du 


\ dB mi /dui 


dBu/duk \ 
dB mi /du k j 


(13) 


3 



will be considered small and discarded in finding the basic SDRE control, in order to simplify 
the calculations involved and keep the form of the control analogous to the const ant-valued 
Riccati equation. 

Taking the derivative of equation (11) yields 

p = H (x,u) x + [D t U (x,u)\x, (14) 


where the term D t H{x,u) represents the total time derivative of II (x(t), it(x(t))) given by 

(15) 


m dU k dB 

D t Yl(x, u)=Y^ q^t( x - u ) x i + Y “)“<■ 


i= 1 


dm 


We then substitute into equation (14) the formulas for x, p 1 u and p determined by equations 
(6,7,9,11), along with the quasi-linear formulas for / and g given above. This results in 


n (x, u ) 


A(x)x — B(x,u)R 1 B t (x , u) + 


'dB 


U; 


1 ~^m,i 


i = 1 


du 


(x,u) B(x,u)x 


+ [D t U(x,u)\x = -Qx - A T (x) + Xi { dAl ^ c m,i {x)) 


'dB i_ 


dx 


+ E‘ = i«i( 

where the A- and B-column derivatives are given similarly to (13) by 

/ dAu/dxx ■■■ dAu/dx m \ 


(x, u )) T n (x,u)x, (16) 


dA 


1 — 


dx 


\ dA m i/dx\ • • • dA m i/dx m J 

( dBu/dxi ••• dBujdxm \ 


dx 


(17) 


(18) 


V dB mi /dx 1 • • • dB mi /dx m J 
By rewriting equation (16) and factoring out x, we obtain 

[ll(x, u)A(x) + H T (x)n (x, u ) — n(x, u)B(x , u)R~ l B T (x, u)Tl(x, u) + Q] 

dB 1 — yjji i \ / \ 

■{x,u)\ 1 l(x,it) 


+ 


m 
i = 1 


0 A { J n(x,M) + ^Wi 


dx 


k 

Y 

i = 1 


dx 


— n (. x , w) B(x. u)R 1 y iq 

*=i 


as 


1— 


du 


(x,u) n (x, it) + _D t n (x, it) 


0, (19) 


The first part of this equation is the state-dependent Riccati equation, and the second part 
is a set of extra terms which, as in equation (12), will be assumed to be small and eliminated. 
These modifications mean that the SDRE will find only a suboptimal control. How sig- 
nificant the difference will be depends on the specific problem, with the greater divergence 
from the linearized form generally resulting in further-from-optimal controls. With the 
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elimination of the derivative terms, the SDRE control will be given by the following Riccati 
equation and control formula: 

fl(x, u)A(x) + A T (x)II(x, u) — II(x, u)B(x , u)i? _1 i? T (x, w)fl(x, u) + Q = 0, (20) 

u(x) = — R~ 1 B t (x 1 u)U(x, u)x. (21) 

3 POWER SERIES FORMULATION 

The nature of the formulation in equations (20-21) creates some significant difficulties in 
finding a solution, specifically that the Riccati equation (20) is dependent on both the state 
and the control, and that the control equation (21) is not a direct expression for u either. A 
direct algebraic solution of these equations is impossible for virtually all cases, so we must 
find a good means of approximation or numerical solution of the equations. 

One approach, from Wernli and Cook [4], is to take a power series expansion for II (x) in 
terms of a temporary variable e: 

OO 

II (x,u,e) = ^ £ 3 L,j(x. u), (22) 

3=0 

then split A and B into constant and variable parts: 

A(x, e) = A 0 + eAA(x), B(x,u,e) = B 0 + eAB(x,u), (23) 

and substitute these into the state-dependent Riccati equation (20). Proceed to separate 
out by powers of £ and solve the resulting series of equations for as many Lj terms as desired: 

LqAq + Aq Lo — LoBqR l BqLo + Q = 0 (24) 

L x (Ao - BoR^B^Lo) + (at - LoBoR^B^) L, + L 0 AA + A A T L 0 

— L 0 (BoR- 1 AB t + ABR~ 1 Bq') Lq = 0 (25) 

Lj (A 0 - BoR^BTLo) + (A t 0 - LoBoR^B^) Lj + Lj_ x AA + A A T Lj_ x 
j - i j - i 

- ^ (. LiBoR-'B^Lj-i ) - £ Li (BoR-'ABT + A BR^B?) Lj _i _ 4 
i=l i = 0 

3 ~ 2 

-Y J L i ABR- 1 AB T Lj-2-i = 0. (26) 

i = 0 

These Lj matrices are then substituted into the control equation (with e set to 1), which for 
N p nonlinear terms becomes 

N p 

u(x) = —R~ l B t (x, u ) ^ Lj(x, u)x. (27) 

j=o 

Power series approaches are also described in [2, 3, 5], but not for systems containing control 
nonlinearities. 

Note that increasing the number of power series terms used will not necessarily improve 
the performance of the control algorithm. For one thing, even an exact solution of the SDRE 
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produces only a suboptimal control, so a closer approximation to the solution may not result 
in a closer-to-optimal control. For another, the performance will vary as the distance from 
the expansion point varies; in particular, farther from this point the higher order terms will 
likely become less accurate and end up detracting from the effectiveness of the control. 

In most cases with only state (not control) nonlinearities the computations in the power 
series expansion are relatively easy to do, since in that situation B 1 n, and Lj are dependent 
only on x. This allows all the major numerical work to be performed offline, leaving just 
the direct evaluation of the Lj(x ) functions at the current state value to be done during the 
control implementation on the subject system. With nonlinear control inputs, however, this 
approach is more difficult. For simple functions of u (such as B(x,u ) = Bi{x) + B 2 (x)u ) 
and a small number of power series terms, it is possible to rewrite equation (27) as a low- 
order polynomial in u and solve this online for the control value at the current x. Choosing 
which root of the polynomial to use is nontrivial, however, and in this implementation it has 
been done both by using the root with the smallest absolute value (to minimize excessive 
control use) and by using the root closest to that in the previous time step (to maintain the 
continuity of the control inputs as much as possible). Alternatively equation (27) can be 
solved iteratively, also online, for u: 


U(n+l){x) =~R 1 B T (x i U {n) ) J2 L j( X ,U(n))x. 

3=0 


(28) 


The starting point U( 0 ) for this iteration process would be a chosen value such as u at the 
previous time step, or« = L 0 x, or u = 0. 

Another approach is to change the structure of the problem by using a cheap control 
formulation, shifting the control variables in the original problem into additional state vari- 
ables, and using the derivatives of the original controls as the new controls. This approach 
is also proposed in [12]. The new system is written as: 



A(x) 

0 


B(x , u) 
0 



0 

h 


«, 


where I k is a k x k identity matrix. The cost functional is rewritten as 




Q 0 
0 R 


x 

u 



dt. 


(29) 


(30) 


The weight R on the new control vector u is set very close to zero to keep this new cost func- 
tional close to the original, but must also not be set too small or else computational problems 
will result. Setting R = 0 exactly would in theory keep the system dynamics the same as 
the original, but this would make the implementation of feedback control formulas such as 
equations (20-21) impossible, as they would then contain an undefined jR - 1 . Extremely 
small but nonzero 4 _1 values will create problems as well. The cheap control approach 
makes the control problem computationally simpler by combining all the nonlinearities into 
the state dynamics and leaving the new control inputs linear. With the new B = [0 I k ] T not 
dependent on u, the calculation of the control using the Lj matrices can be done without 
the need for a polynomial solution or iterative process. 
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The power series method has the advantage that most of its calculations are done offline, 
with only the polynomial solution or iteration for u (both of which are fast) performed during 
the control implementation. However, the control may be less optimal than desired, since 
the SDRE generates a suboptimal form of the control to begin with, and the use of a power 
series results in a further approximation. The best number of power series terms to use is 
also difficult to determine, and may vary from application to application. 

4 ONLINE CONTROL UPDATE FORMULATION 

As an alternative to the power series approach discussed in the previous section, we can use 
a type of method which does most of the numerical calculations online, during the control 
implementation. The control equations (20-21) can be solved repeatedly as the system 
evolves to update the gain n and control u: 

bln+i A.(xn) T A (x n )n n _)-i Tl n +iB(x n , u n )R B (x ni "U n )n n _|_i + Q — 0, (31) 

ffii+l — R B B-n+lX n . (32) 

(Here x n and u n give the state and control values at time step n .) These evaluations can be 
done a single time, or, in an effort to obtain a more accurate solution, in an iterative loop 
back and forth between the two equations given by 

ffn+i ,k-A(xn) T A (x n )n n _|_i,A: B n j r \^B(x n ^u n j r \^k)R B (x n , 'U r j-|_i ; ^)n n ^_i^ T Q — 0, (33) 

'u , n+i,k+i — B B ( x n , u n+hk )U n+hk x n , (34) 

starting from an initial iterate such as M n +i,o = u n . (Here n is the time step number and k 
is the iteration number at the current time step.) 

In implementation on a physical system, where the control calculations must be done in 
real time, the gain and control updates must be done only at some discrete intervals to allow 
time for the computations to be done. This will produce a less effective control than with 
constant updates, but if the control can be reevaluated quickly enough this approach may 
still result in better performance than the power series method, since it solves the Riccati 
equation exactly, while the power series finds an approximate solution. For investigating 
the performance of the control method on a simulated problem, however, the updates can 
be done at every time step in the ODE solution (as will be done on the test examples in the 
sections to follow). 

While deriving the power series approximation in equations (24-27), the extra derivative 
terms of A, B and n are dropped from equations (19-12) so that the control formula remains 
in Riccati equation form and is more easily solveable. However, when updating the control 
online, as in equations (31-32) or (33-34), as a variation on the control method we can retain 
the extra derivative terms and instead solve 

ffn+i-4(^n) T A (x n ^)TL n ^i TL n ^-iB (x ni u n ^R B (x n , u n ^TL n ^i T Q 
, > (3A^ m , i,_ . A.. , (dB^ m , i,_ .. 

“1“ / A %n)i ^ \%n) J n n _^i ~h / A ^nJi ^ J n n _|_i 

- n n+1 B(x n , , » rt )R~ 1 u n )i ( x n , n n+ i + D t U (x n , Un) = 0, (35) 
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^n + 1 — R B (x n: 'U n )n n +iX n ^ (^r 75 ^ 77 )^ n n _|_iX n (36) 

(or an iterative form similar to equations (33-34) but including the extra derivative terms). 
One simple way to obtain a numerical approximation of the time derivative of II in this 
equation is with the difference formula 

A n {x n , u n ) = (n n+1 - n„) / (t n+ i - t n ) . (37) 

Since equations (35-36) are solved online, the previous value of II will be available so that this 
calculation can be done, while with the offline power series calculations there is no way to 
evaluate An and so it must be assumed small and ignored. Equation (35) obviously cannot 
be solved with an algebraic Riccati equation solution algorithm, but must be solved using- 
some more general zero-finding technique for nonlinear systems (such as a Newton- Raphson 
algorithm). 

As an further control variation, if we are using a general zero-finding algorithm, we can 
choose to apply it to the combined system, 


n?7-i- 1 A {x n ) -t- A (;r n )II n _|_i Tl n +iB (x ni u n ^-i^)R B (x n , w^+OkR+i T Q 


+ y^X x n)i 


dAi^ m ^ 

dx 


{%n) ) n n _|_i + ^ y (^n+l)i 


-{x n ^u n j r \j n n _)_x 


B. n -\.\B{x n ^U n -i r \'jR ^ ) (^n+l)i 


-'1—7777,7 

Du 


{x-n , u n - (_i) n n _|_i T Df n ( x ni u n+ i) — 0, (38) 


^77+1 "b R B (*^775 '^77+l)n77+l^'77 R ^ ( (^ 77 + 1)7 


(2-775^77+1) fif-77+ 1 2-77 — 0, 


finding both the II and u updates simultaneously instead of first finding II in (35) and then 
updating u in (36). 

An approach using cheap control, as described in equations (29-30), is also an option 
with control through online gain matrix evaluations. The idea again is that restructuring 
the problem may decrease the effects of potential pitfalls due to the control nonlinearities, 
while hopefully not altering the problem so much that the resulting control is ineffective for 
the original system. This may be particularly useful when dealing with a hard-to-control 
scenario such as the one described below. 


One aspect of nonlinear-w dynamical systems which presents an obstacle to feedback 
control implementation is related to the fact that there may be values of u which result 
in the control-dependent B matrix becoming zero. The system becomes uncontrollable at 
the specific point where B( x) = 0, and the control formula in equation (21) is reduced to 
u(x ) = 0. However, this situation can have effects on some of the formulas we are studying 
even away from the zero point itself. Examples One and Two in the sections to follow are 
problems which contain a zero-valued B for certain u values. This results in a discontinuity 
in the u update iteration generated by the online-update SDRE algorithm in equations (33- 
34). Depending on the specifics of the example involved, this may cause major problems 
in obtaining convergence of that algorithm (and the related algorithms in (35-36) and (38- 
39) which involve iterative processes as well) in some region around the zero point. This 
difficulty will be illustrated on Example One in the next section. 
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The effectiveness of the power series method also may be impaired by an uncontrollable 
B(u) = 0 point if the iterative form of the method in equation (28) is used, since its conver- 
gence may become uncertain. In contrast, other control techniques may not be affected by 
this behavior; the polynomial form of the power series method and also the cheap control 
version of the method should be more likely to succeed in spite this obstacle, as they are 
based on direct evaluation of equations instead of an iterative process, (as well as finding 
their control values through an approximation rather than the original system itself). A 
cheap control version of the online-evaluation method will be working with a modified ver- 
sion of the system, but will still be using an iterative evaluation process which may have 
difficulties in converging. 

In sum we have a number of different algorithms which are variations on the general 
method using the state-dependent Riccati equation. Those in this section are based on 
the re-evaluation of the SDRE over and over during the control process, while those in the 
previous section are based on using a power series approximation of the SDRE to simplify the 
control evaluation so that it can be done primarily offline. Some of these variations may be 
more effective than others at controlling systems with both state and control nonlinearities, 
and overcoming difficulties created by situations like the zero point discussed above. This 
will be studied in the example simulations in the following sections. 

5 EXAMPLE PROBLEM ONE 

The control algorithms discussed here were implemented in Matlab programs on a set of 
example problems to test their ability to control nonlinear-tt systems. The first of these 
examples is a modified version of a model of the flight dynamics of a high-performance 
aircraft, taken from Garrard, Enns and Snell [15]. The model contains five state variables 
representing the flight conditions of the aircraft: x\ is the deviation of the velocity from 

its trim value (given in units of (100m)/s), x 2 is the deviation of the angle of attack from 
trim (in radians), x 3 is the pitch rate (rad/s), x 4 is the flight path angle (radians), and x 5 is 
the deviation of the canard (control flap) deflection angle from trim (radians). The single 
control u is the input canard deflection in radians. 

The model is quadratically nonlinear in the state variables, and we have added a (non- 
physical) quadratic nonlinearity (involving B 2 as seen below) to the control dynamics to test 
the ability of the algorithms to stabilize a system of this type. The system is given by 

x = (A 0 + x 2 Anl) x + ( B\ + uB 2 ) u, (40) 

where the matrices A 0 , A NLl B 4 and B 2 are all constant-valued, and are given by: 

' -0.0443 1.1280 0.0 -0.0981 0.0 

-0.0490 -2.5390 1.0 0.0 -0.0854 

A 0 = -0.0730 19.3200 -2.2700 0.0 22.6834 , 

0.0490 2.5390 0.0 0.0 0.0854 

0.0 0.0 0.0 0.0 20.0 

' -0.2317 0.0 0.0 0.0 0.0 

-1.2760 -0.7922 0.0 0.0 0.0206 

A nl = 0.1020 64.2940 -13.9710 0.0 -5.4167 , 

1.2760 0.7922 0.0 0.0 -0.0206 

0.0 0.0 0.0 0.0 0.0 
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( 41 ) 


By = [ 0.0 0.0 0.0 0.0 20.0 

b 2 = [ 0.0 0.0 0.0 0.0 2.0 ’ 


The cost functional to be minimized (also modified from that used in [15]) is 


J(x o, u) 


i r°° / \ 

- J (x T Qx + u 2 J dt. 


where the state weight matrix is 


Q = 


100 0 0 0 0 
0 10 0 0 0 
0 0 10 0 0 
0 0 0 100 0 
0 0 0 0 10 


(42) 


(43) 


The SDRE control techniques were applied to this example with the initial state chosen 
as x 0 = [0 25(7t/ 180) 0 0 0] T , a large angle of attack which must be brought back 
to trim, as efficiently as possible with respect to J. The main variations of the SDRE 
tested were “cheap control” and standard formulations of the dynamic system, and power 
series approximation (with varying number of terms) and online-evaluation approaches to 
solving the SDRE. The power series method was applied using one, two, and three power 
series terms. Also, the non-cheap power series method was implemented with u found both 
through an iterative solution of equation (28) for u , and a polynomial evaluation of equation 
(27), as described in Section 3. In the polynomial evaluation, the choice of root was that 
with the smallest absolute value. 

As mentioned in Section 4, some nonlinear- w systems have the property that B(u ) = 0 at 
certain nonzero values of u, rendering the system uncontrollable at those points and creating 
the potential for problems in some of the algorithms. This feature is illustrated for the 
aircraft dynamics model in two figures showing the control update mapping u n+ i = F(u n ) 
representing the evaluation of equations (31-32) - or a single pass through the iterative 
scheme (33-34). Figure 1 evaluates this mapping at state x = [0 25(-7r/180) 0 0 0] T , with 
the discontinuity present at u n = — 10 (the point at which B = 0). At this 25° angle of 
attack the discontinuity still allows a solution to the update equations (33-34) - that is, a 
control value where the iterative process has converged so that F{u n ) = u n . (In fact there 
are two possible solutions, which is an interesting problem but still allows for the system to 
be controlled.) However, if the same update mapping is examined at a 35° angle of attack, 
the influence of the discontinuity is broader and there is no solution to the equations, as 
Figure 2 shows. In that case the iterative algorithm in equations (33-34) will not be able to 
converge, and it will be nearly impossible for the algorithm to control the system successfully. 

In contrast with the success of equations (33-34) at a 25° angle, attempts to use a Newton- 
Raphson algorithm in the full nonlinear online-update method (35-36) and the combined 
gain/control update method (38-39) were unable to converge and find a gain matrix solution. 
The inclusion of the more complex dynamics and/or the less specific numerical algorithm 
make the solution of this system of equations much more difficult. The approaches using the 
Newton-Raphson algorithm were discarded as a result of this in favor of the methods more 
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closely related to the Riccati equation (and implemented using a built-in Matlab Riccati 
equation solver), which appear to be significantly more robust. 

The cost values resulting from each of these methods are given in Table 1, as is the result 
of a linear Riccati control found by linearizing the system about x = 0, u = 0. The most 

Table 1: Cost functional values for SDRE control algorithms on Example One. 


Method 

Cost 

Cost with Noise 

Cheap Power Series (1 term) 

5.448 

5.672 

Cheap Power Series (2 term) 

5.578 

5.809 

Cheap Power Series (3 term) 

5.580 

5.797 

Plain Power Series (1 term, iterative) 

4.426 

4.547 

Plain Power Series (2 term, iterative) 

4.470 

4.580 

Plain Power Series (3 term, iterative) 

4.479 

4.562 

Plain Power Series (1 term, polynomial) 

4.426 

4.540 

Plain Power Series (2 term, polynomial) 

4.469 

4.615 

Plain Power Series (3 term, polynomial) 

4.479 

4.588 

Cheap Online Evaluation 

4.460 

4.605 

Plain Online Evaluation 

4.585 

4.675 

Linearized Control 

4.641 

4.807 


efficient method is the standard form power series control, with the cheap online-evaluation 
control having a cost only very slightly higher than that. The standard form version of 
the online-evaluation control is also fairly close, while the cheap power series control is 
significantly off from the other three variations and the linear control. All show a clear 
ability to stabilize the aircraft from an initial 25° angle of attack. Changing the number of 
terms used in the power series methods seems to have only a small effect in this case, and 
the polynomial and iterative solutions for the standard form power series algorithm prove to 
be essentially identical here. 

In contrast with the power series method, in the online-evaluation control the alteration 
to use cheap control increases the effectiveness of the algorithm on this example problem. 
The reason may be that the controllability issues are minimized through this modification of 
the system. In contrast, the power series method was not expected to be affected as much 
by the controllability problem. Thus the addition of cheap control would not be expected 
to help much, and it may have detracted from the method’s effectiveness due to it finding u 
values based on a system slightly different from the original. 

For this example a good choice of weight in the cheap control cost functional (30) was 
R = 10 -4 . Other values were less effective, as larger ones apparently moved the cheap 
control system too far from the original problem, while smaller ones apparently created 
problems with FT 1 blowing up. 

The norm of the state variables, x T Qx 1 is plotted in Figure 3 for the four main versions 
of the SDRE algorithm. The most efficient of the versions with different numbers of power 
series terms and calculation methods are plotted: cheap control with one term, and standard 
control with one term and the polynomial solution. The two online-evaluation control results 
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Figure 3: Comparison of state norms in Example One with several control methods. 


are shown as well. The control inputs calculated by these same methods are plotted in Figure 
4. The values of the control u remain greater than the problematic value of -10 where the 
system is uncontrollable. Note that these figures (as well as later ones) display only the first 
few seconds of the state and control trajectories. The simulations themselves were carried 
out through an interval of 10 seconds to allow for more complete convergence, but since the 
important dynamics occur in the early time interval, that is the region shown for clarity. 

The various control methods were also applied to the same model with a uniform random 
noise source \S(t)\ < 2.5(7t/ 180), ten percent of the initial nonzero state, added to the 
dynamics to see how the methods would compensate for this in controlling the problem. In 
this example the results are very similar to those without noise, with the methods’ costs 
falling in the same order and the plain form of the power series method having the lowest 
cost overall. In each case the cost has increased by a noticeable amount but the system is 
stabilized without too much difficulty, as the divergence due to noise in this case seems to 
have made the problem only slightly harder to control. These costs are listed in Table 1 
along with the noise-free case. 

Simulations using one selection from each class of control algorithms were run with larger 
initial angles of attack, to see which of the methods would prove effective at stabilizing the 
system over a larger range of conditions. The three one-term power series methods were 
used, as were the two online-evaluation methods. The same process was followed as described 
above, with initial angles of 35°, 45°, and 55°. The resulting cost functional values are given 
in Table 2. 

The various algorithms drop out one by one as the initial angle becomes larger and the 
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Figure 4: 


Comparison of control inputs for Example One with several control methods. 


Table 2: Cost functional values with larger initial angles in Example One. 


Method 

Cost 

Initial Angle 35° 

Cheap Power Series (1 term) 

12.064 

Plain Power Series (1 term, iterative) 

(fails) 

Plain Power Series (1 term, polynomial) 

10.678 

Cheap Online Evaluation 

11.160 

Plain Online Evaluation 

(fails) 

Linearized Control 

11.370 

Initial Angle 45° 

Cheap Power Series (1 term) 

22.644 

Plain Power Series (1 term, polynomial) 

25.036 

Cheap Online Evaluation 

(fails) 

Linearized Control 

(fails) 

Initial Angle 55° 

Cheap Power Series (1 term) 

38.550 

Plain Power Series (1 term, polynomial) 

(fails) 


problem becomes more difficult to control. More than a mere increase in the magnitude of 
the state value to be stabilized, there is something of a qualitative change in the scenario. As 
noted above, at a 35° angle of attack the online-evaluation iterative control update becomes 
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unsolvable (while at 25° it has a findable solution). The other algorithms may also be 
affected to some degree by the element of uncontrollability in the problem as the range of 
angles considered expands. The control values involved in some algorithms may approach 
the danger point of -10 and as a result fail to converge to an effective control. 

The two standard power series approaches had been the most effective at stabilizing the 
system from a 25° angle and had virtually identical results in that case. In these additional 
simulations, as the system gets more difficult to control, the iterative version of the power 
series fails at 35°, but the polynomial version is adaptable to a much larger range. It is 
still successful at 45°, though the cheap power series control has overtaken it in terms of 
cost. Eventually, at 55° only the cheap power series algorithm can successfully control the 
system. Thus the least efficient of the methods at small angles is the one able to stabilize 
this example over the widest range of conditions using its slow-but-steady type of control. 
This can be seen in Figure 3 and also in Figure 5, which plots the state norms for the higher 
35° angle. Also note that as the angle increases, so does the size of the initial spike of 



Figure 5: State norms in Example One for a 35° initial angle. 


activity, as the algorithms seek to bring the angle down quickly. This spike is much greater 
for the other methods than for the cheap power series, making it unsurprising that they 
would become unstable sooner at high angles while the cheap power series method is able to 
continue operating in that range. 

6 EXAMPLE PROBLEM TWO 

The second example problem used to test the control algorithms was taken from Wernli 
and Cook [4], in which the use of power series approximations was proposed for finding a 
gain matrix from the state-dependent Riccati equation. That approach was applied to the 
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following system, and here we use our versions of the SDRE method on it. This is a small 
two-state system with one control: 


xi = x 2 (44) 

x 2 = x^l + ^1 + \/\xi\^ (2m -I- w 2 ) , (45) 

which the control methods will try to stabilize in such a way that the cost functional 

1 r°° / \ 

J = - ( 2x\ + x\ + u 2 j dt (46) 

2 Jo 


is minimized. 

The quasi-linearization chosen by Wernli and Cook to set up the SDRE form is: 


Xi 


0 1 ' 

Xi 


0 

. . 


_x\ 0 _ 

. X2 . 

+ 

(l + y/xfj (2 + u) 


(47) 


Using this form of the system the various SDRE methods described in this paper are imple- 
mented in an attempt to control the system from three initial states: x 0 = [0.5 0.0] T , x 0 
= [1.0 0.0] T , and x 0 = [0.5 0.5] T , the three states used by Wernli and Cook in their paper. 
As with Example One, there is a danger of possible uncontrollability hindering stabilization 
of the system, in this case near the point u = —2 where B(u ) = [0 0] T . 

The cost functional results are presented in Table 3 for the three chosen initial conditions 
and the various control techniques. The best values in [4] (found using a standard-form 


Table 3: Cost functional values for SDRE control algorithms on Example Two. 


Method 

Initial Conditions 

[0.5 0.0] 

[1.0 0.0] 

[0.5 0.5] 

[0.5 0.5] w/Noise 

Cheap Power Series (1 term) 

0.2593 

1.0322 

0.4076 

0.4114 

Cheap Power Series (2 term) 

0.2724 

1.0891 

0.4511 

0.4566 

Cheap Power Series (3 term) 

0.2714 

1.1115 

0.4236 

0.4289 

Plain Power Series (1 term, iter) 

0.2669 

(fails) 

0.6671 

0.6604 

Plain Power Series (2 term, iter) 

(fails) 

(fails) 

(fails) 

(fails) 

Plain Power Series (3 term, iter) 

0.2591 

(fails) 

0.7566 

0.7455 

Plain Power Series (1 term, poly) 

0.2669 

(fails) 

(fails) 

(fails) 

Plain Power Series (2 term, poly) 

0.2594 

(fails) 

0.5339 

0.5363 

Plain Power Series (3 term, poly) 

(fails) 

(fails) 

(fails) 

(fails) 

Cheap Online Evaluation 

(fails) 

(fails) 

(fails) 

(fails) 

Plain Online Evaluation 

0.2673 

1.1905 

0.9214 

(fails) 

Linearized Control 

0.2580 

1.1297 

0.7069 

0.7094 


power series method) are: for x 0 = [0.5 0.0] T , J = 0.256; for x 0 = [1.0 0.0] T , J = 1.11; 
for x 0 = [0.5 0.5] t , J = 0.604. In this example the cheap power series form of the SDRE 
method is most efficient at controlling the system. It results in cost values which are almost 
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the same as (for [0.5 0.0] T ) or less than (for the other two conditions) the values found in 
[4] or with any of the other algorithms here. 

The controls generally have a more difficult time in stabilizing this example than they 
did with Example One, as some methods fail to control the system, especially at the larger 
initial condition [1.0 0.0] T . The cheap online-evaluation control does not succeed with any 
of the three initial states, and the standard power series controls only succeed intermittently 
for various states and numbers of series terms. The standard online-evaluation control does 
stabilize all three cases. However, though it is fairly close to the cost of the cheap power 
series control for initial conditions of [0.5 0.0] T and [1.0 0.0] T , it is much larger for [0.5 
0.5] t . Control of this nonzero-x 2 condition seems to be more complex, as the cost functional 
values are much more widely spread in comparison to the smaller variation between methods 
with the other two initial conditions. This can also be seen in the linearized-control results, 
which are very good for [0.5 0.0] T and [1.0 0.0] T but fall off for the more complex case of 
[0.5 0.5] t . In the cheap power series control, the one-term version is the most efficient. 

The state variables for selected methods are plotted in Figure 6 for initial state [0.5 0.5] T . 
The algorithms plotted are the cheap power series control with one term, the standard power 



Figure 6: Comparison of state variables in Example Two with several control methods. 


series control with two terms (and polynomial solution), and the standard online-evaluation 
control. The control inputs for these same methods are presented in Figure 7. It can 
be seen that each of these successful controls stay away from the u = — 2 value where the 
system becomes uncontrollable (though the standard online-evaluation control comes very 
close). The methods which fail to stabilize the system may be encountering difficulty with 
that aspect of the problem when trying to exert a more powerful control influence. 

A set of simulations was done with initial condition [0.5 0.5] T , and the addition of 
uniform noise \S(t)\ < 0.05. The results, given in Table 3, align closely with those of that 
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0 1 2 3 4 5 6 

Time 

Figure 7: Comparison of control inputs for Example Two with several control methods. 


scenario without noise, with the exception that the standard online-evaluation control fails 
in the case with noise. The cheap power series method (using one term) is still the most 
efficient. 

7 EXAMPLE PROBLEM THREE 

A third example, taken from a paper by Lin [16], describes an angular momentum equation 
for a rigid body with non-affine control inputs. The system contains three states and two 
controls, including both linear and highly nonlinear control input terms: 


X 1 

= -X 2 X 3 + U 1 

(48) 

X2 

= 2x 1 x 2 x 3 + u 2 

(49) 

x 3 

= X\X 2 + X\X 3 2 (1 — COSX 3 ) — l) . 

(50) 


Lin’s control approach is stability-based rather than performance-based using a cost func- 
tional, so for our purposes we introduce one, given by 

J = ^ J (x\ + x\ + ICteg + u\ + dt. (51) 

In quasi-linearizing this example, there is no controllability issue as in the first two 
examples where B(u) = 0 at some nonzero u. However, the control term in the x 3 equation 
is notably troublesome due to its extreme nonlinearity. Control inputs with magnitude 
greater than 1 can have very large effects on the system dynamics. We write the system in 
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the desired SDRE form as: 


’ Xi ' 

0 

~x 3 /2 

~x 2 /2 

’ X 1 1 r n 


X 2 

= 2x 2 x 3 /3 

2xiX 3 /3 

2xiX 2 /3 

x 2 +B Ul . 

n I 

(52) 

. ^3 . 

x 2 /2 

Xi/2 

0 

Uj 2 

L x s j 1 J 



The choice of B is a difficult one, since the x 3 control term is not a pure polynomial which 
could be easily split up. One approach is to modify this term by multiplying and dividing 
it by ui to obtain 

1 0 ' 

B= 0 1 , (53) 

xixl(l — cosx 3 ) ( e^ UlU2 ^ — lj ju\ 0 

but this has an inherent problem in calculating B when iq approaches 0. Other versions 
which were tried include a similar formulation in terms of u 2 , a formulation which uses both 
- splitting that term from (50) into two terms similarly to the way the state nonlinearities 
are split in (52) - and a formulation which switches between the two forms, keeping the u* 
value farthest from zero as the one by which the original term is divided. 

However, none of these remove the difficulty entirely, but merely superficially modify it 
to try to lessen its impact. We can also alter the problem more significantly, creating an 
approximation to the actual formula for use in finding the control, and then implementing 
that control back on the original system. With this example this was done by writing the 
exponential term as a power series, 

e ( “i“ 2 ) 2 = 1 + (uiu 2 ) 2 + ^(wiw 2 ) 4 + ^(wiih?) 6 --- (54) 

and using only the first two or three terms to approximate it. Substituting the truncated 
series into the system results in a two-term approximation of 

1 O' 

B = 0 1 (55) 

Xix\{l — COSX 3 )uiu\/2 Xix\{l — COSX 3 )u\u 2 / 2 
or a three-term approximation of 

1 0 

B= 0 1 (56) 

x\x\(l — cosx 3 )(uiul + u\u\/2)/2 x\x\(l — cosx 3 )(u\u 2 + u\ul/2)/2 

(In these formulations the control term has been split into multiple parts, as the state 
dynamics terms have been in equation (52), to spread the effects further over the A and B 
matrices.) This alteration of the problem creates a different approach which might be more 
stable than the volatile original form. 

Simulations were run using an initial state of x = [1 -1 - 1 ] T , with each of the control 
algorithm variations. Due to the type of control nonlinearity here having different imple- 
mentation and controllability issues, as described above, the results were quite different from 
the previous examples. All of the non-cheap control algorithms fail to stabilize the system, 
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while the cheap control methods have more success, probably due to the movement of the 
exponential from the control to the state dynamics reducing its troubling effects. Trying to 
control the system using the exponential nonlinearity as is proved to be mostly ineffective, 
while the use of a power series approximation for that nonlinearity in finding the control 
improved things significantly. The cost functional values for the successful methods are 
given in Table 4, also noting the specific form of exponential used in each method. There 


Table 4: Cost functional values for SDRE control algorithms on Example Three. 


Method 

Cost 

Cost with Noise 

Cheap Power Series (1 term, u 2 form) 

32.33 

(fails) 

Cheap Power Series (2 terms, u 2 form) 

42.06 

(fails) 

Cheap Power Series (3 terms, u 2 form) 

61.17 

(fails) 

Cheap Power Series (1 term, 1-term form) 

32.88 

32.44 

Cheap Power Series (2 terms, 1-term form) 

39.79 

48.85 

Cheap Power Series (3 terms, 1-term form) 

29.33 

31.03 

Cheap Power Series (1 term, 2-term form) 

34.47 

34.19 

Cheap Power Series (2 terms, 2-term form) 

28.15 

(fails) 

Cheap Power Series (3 terms, 2-term form) 

28.37 

29.31 

Cheap Online Evaluation (1-term form) 

8.87 

8.80 

Cheap Online Evaluation (2-term form) 

6.91 

8.77 


are no linearized-control results in this example because the extremely nonlinear nature of 
it makes impossible the finding of a viable control through linearization as in the previous 
sections. 

The following versions of the cheap power series method have the norm of their resulting 
state vectors plotted in Figure 8: that using the u 2 form of the exponential term (and one 
term in the SDRE equation power series expansion), the one-term exponential expansion 
(with three SDRE series terms), and the two-term exponential expansion (with two SDRE 
series terms). Also plotted are the cheap online-evaluation methods with one and two 
exponential expansion terms. The control inputs for these methods are shown in Figure 9. 

Although there is some variation between the different implementations of the cheap 
power series method, the cheap online-evaluation methods produce dramatically more effi- 
cient results that any of them. The power series algorithms seem to have a pause of sorts 
in the control, as the way those particular methods interact with this example’s dynamics 
seemingly leads to the control influence being applied gradually, with some of the major 
effects not appearing until three or four seconds into the simulation. In contrast, the online- 
evaluation controls have a bigger initial burst of control influence, which leads to the system 
being largely stabilized after only one or two seconds. Of the different approaches to treating 
the exponential control nonlinearity, the two-term power series approximation was the most 
effective for both the SDRE power series and online-evaluation algorithms. Strangely the 
best choice for number of terms in the power series representing the SDRE solution varied, 
with a different number producing the lowest cost for each different way of dealing with the 
exponential. The success of the online-evaluation method may be partly a consequence of 
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Figure 8: Comparison of state variables in Example Three with several control methods. 



Figure 9: Comparison of control inputs for Example Three with several control methods. 
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this example not containing a controllability danger as the previous two did, making it easier 
for that version of the SDRE method to construct a strong control using large initial control 
input values. 

As in the previous two examples, these simulations are repeated with an added uniform 
noise, \5(t) \ < 0.1, in the system dynamics. The cost values for these simulations are given 
in Table 4 as well. The online-evaluation controls again perform much better than the power 
series controls. There are changes among the different methods, though; the u 2 form of 
the exponential now fails to produce a viable control, and the cheap power series controls 
with two terms representing the SDRE both are much less effective than they had been 
without the noise. Though the form of the online-evaluation control with two terms for the 
exponential is still the most effective, it also is reduced somewhat in comparison with the 
one-term version which handles the noise easily. 

8 CONCLUSIONS 

The test problems studied, each nonlinear in both the state and control, were each success- 
fully stabilized by versions of the state-dependent Riccati equation based feedback control 
method. The power series method, with a cheap control formulation moving the nonlin- 
earities all into the state dynamics, was the most consistent method overall, successfully 
controlling all three test problems. The cheap control version of the online SDRE evalua- 
tion method was much more efficient than the others at controlling Example Three, but was 
vulnerable to failure with the other two examples. 

In light of these results no one specific method appears to be an obvious best choice. For 
problems which may have difficulties with controllability at certain points, these simulations 
suggest that the more stable cheap power series version of the SDRE method is probably best, 
while for problems without such a concern, the cheap online-evaluation version might be more 
efficient. The forming of a cheap control system to move any control nonlinearities into the 
state dynamics seems to be a quite helpful tool in preparing the system for SDRE regulation. 
The different SDRE control variations should be kept in mind when approaching the control 
of a nonlinear system of the type studied here. Additional options are also contained in 
these methods, such as the number of terms to use in the power series approximating the 
SDRE gain matrix, or the way to write a certain example in the quasi-linearized SDRE form 
(including whether to make alterations to the system as was done in Example Three). These 
options can be used as seems appropriate for specific applications, as their effects will vary 
from system to system. 

Despite these unanswered questions, the SDRE-based methods performed well. Fu- 
ture research plans in this area include the application of this control technique to higher- 
dimensional systems than those discussed in this paper, as well as possible work on using 
an SDRE-based method in combination with other control methods, similar to that in [17]. 
The goal for that investigation would be to combine methods in such a way that the speed 
and efficiency of the optimality-theory-based SDRE algorithms are largely retained while 
the stability and robustness properties of a different algorithm are merged into it as well, so 
that the strong control performance qualities noted in the examples above can be extended 
with confidence over a much broader domain. 
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